This analysis addresses feedback from Francis regarding the temporal alignment of weather predictors with butterfly responses. The original daily-level analysis used fixed 6am-6pm weather windows, which Francis identified as having a temporal logic issue:
“All of the metrics for wind, temperature and light, would need to be re-calculated for 24 hour periods that begin at the time of the highest count.”
1.1 Francis’s Key Points
Temporal alignment: Butterflies can only respond to weather after it occurs
Biological timing: If max count occurred at 2pm on day t-1, the relevant weather window should start at 2pm, not 6am
Roosting decisions: Weather from max count through sunset determines whether butterflies abandon the roost
Weather metrics now include overnight conditions (24/7 temperature and wind)
Window length is exactly 24 hours for all observations
Captures weather exposure over a standardized period following peak count
Tests the hypothesis: “Do weather conditions in the 24 hours following peak count predict roost abandonment?”
Data source: data/monarch_daily_lag_analysis_24hr_window.csv
Comparison to sunset window: This analysis uses a fixed 24-hour window rather than a variable-length window ending at functional sunset. This allows comparison between standardized vs. biologically-defined window endpoints.
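The window construction described above can be sketched in base R. This is a minimal illustration, not the actual data preparation script; the function name and the input timestamp are assumptions.

```r
# Sketch: a 24-hour weather window anchored at the time of the
# previous day's maximum count. The window always spans exactly
# 24 hours, so overnight conditions are included by construction.
make_24hr_window <- function(max_count_time) {
  list(
    window_start = max_count_time,
    window_end   = max_count_time + 24 * 3600  # POSIXct arithmetic is in seconds
  )
}

# Example: peak count observed at 2pm on day t-1
w <- make_24hr_window(as.POSIXct("2024-01-28 14:00", tz = "UTC"))
```

A window ending at functional sunset would instead replace the fixed `24 * 3600` offset with a variable endpoint, which is exactly the comparison this analysis sets up.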
We evaluate three butterfly difference metrics (max, 95th percentile, and top 3 mean) with three transformations each (untransformed, square root, and square) to determine which best approximates normality.
**Best transformation for normality:** butterfly_diff_sqrt (W = 0.9885, p = 0.588)
Code
# Create histograms
par(mfrow = c(3, 3), mar = c(4, 4, 3, 1))
for (var in response_vars) {
  var_data <- data_filtered[[var]]
  var_data <- var_data[!is.na(var_data)]

  # Histogram
  hist(var_data,
    breaks = 30, probability = TRUE,
    main = sprintf(
      "%s\n(W=%.3f, p=%.4f)", var,
      normality_tests$shapiro_stat[normality_tests$variable == var],
      normality_tests$p_value[normality_tests$variable == var]
    ),
    xlab = "Value", col = "steelblue", border = "black"
  )

  # Overlay normal distribution
  x_seq <- seq(min(var_data), max(var_data), length.out = 100)
  lines(x_seq, dnorm(x_seq, mean(var_data), sd(var_data)),
    col = "red", lwd = 2
  )

  # Add grid
  grid()
}
4.1 Key Findings
Square root transformations perform best, with all three failing to reject normality (p > 0.5)
Untransformed differences show significant departures from normality (p < 0.0001)
Square transformations severely violate normality with extreme kurtosis (>9)
butterfly_diff_sqrt has the highest Shapiro-Wilk statistic (W = 0.9893, p = 0.6368)
Square root transformations achieve near-zero skewness and kurtosis close to 0
Selected response variable for modeling: butterfly_diff_sqrt (square root of max butterflies difference)
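The transformation search above can be sketched as follows. The data here are simulated for illustration; the real analysis applies `shapiro.test` to the three butterfly difference metrics, and the signed square root shown is one plausible way to handle differences that can be negative.

```r
# Sketch: compare candidate transformations of a difference metric
# with the Shapiro-Wilk test, keeping the one with the highest W.
set.seed(1)
diff_raw <- rnorm(90, 0, 5)^2 * sign(rnorm(90))  # skewed toy "difference" data

# A signed square root preserves the direction of the overnight change
# while compressing heavy tails; squaring does the opposite.
candidates <- list(
  untransformed = diff_raw,
  sqrt_signed   = sign(diff_raw) * sqrt(abs(diff_raw)),
  squared       = sign(diff_raw) * diff_raw^2
)

shapiro_W <- sapply(candidates, function(x) unname(shapiro.test(x)$statistic))
best <- names(which.max(shapiro_W))
```

This mirrors the reported pattern: square-root transforms approach normality, while squaring inflates kurtosis.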
5 Candidate Predictor Variables
Following the original analysis structure, we consider all potential weather and baseline metrics, then use correlation analysis to select final predictors.
var_descriptions <- tribble(
  ~Variable, ~Description, ~Type,
  "butterflies_95th_percentile_t_1", "95th percentile count on previous day (baseline)", "Baseline",
  "max_butterflies_t_1", "Maximum count on previous day", "Baseline",
  "butterflies_top3_mean_t_1", "Mean of top 3 counts on previous day", "Baseline",
  "temp_max", "Max temp in 24hr window (includes overnight)", "Temperature",
  "temp_min", "Min temp in 24hr window (includes overnight)", "Temperature",
  "temp_mean", "Mean temp in 24hr window", "Temperature",
  "temp_at_max_count_t_1", "Temperature when max count occurred", "Temperature",
  "hours_above_15C", "Hours ≥15°C in window", "Temperature",
  "degree_hours_above_15C", "Cumulative degree-hours >15°C", "Temperature",
  "wind_avg_sustained", "Mean sustained wind speed in window", "Wind",
  "wind_max_gust", "Maximum gust in 24hr window (includes overnight)", "Wind",
  "wind_gust_sum", "Sum of all gust measurements", "Wind",
  "wind_gust_sum_above_2ms", "Sum of gusts >2 m/s", "Wind",
  "wind_gust_hours", "Gust-hours (integral)", "Wind",
  "wind_minutes_above_2ms", "Minutes with wind ≥2 m/s", "Wind",
  "wind_gust_sd", "SD of gust speeds (variability)", "Wind",
  "wind_mode_gust", "Most frequent gust speed", "Wind",
  "sum_butterflies_direct_sun", "Sum of butterflies in direct sun (24hr window)", "Sun",
  "lag_duration_hours", "Window length in hours (fixed at 24)", "Window"
)

kable(var_descriptions, caption = "Candidate predictor variables")
Candidate predictor variables

| Variable | Description | Type |
|---|---|---|
| butterflies_95th_percentile_t_1 | 95th percentile count on previous day (baseline) | Baseline |
| max_butterflies_t_1 | Maximum count on previous day | Baseline |
| butterflies_top3_mean_t_1 | Mean of top 3 counts on previous day | Baseline |
| temp_max | Max temp in 24hr window (includes overnight) | Temperature |
| temp_min | Min temp in 24hr window (includes overnight) | Temperature |
| temp_mean | Mean temp in 24hr window | Temperature |
| temp_at_max_count_t_1 | Temperature when max count occurred | Temperature |
| hours_above_15C | Hours ≥15°C in window | Temperature |
| degree_hours_above_15C | Cumulative degree-hours >15°C | Temperature |
| wind_avg_sustained | Mean sustained wind speed in window | Wind |
| wind_max_gust | Maximum gust in 24hr window (includes overnight) | Wind |
| wind_gust_sum | Sum of all gust measurements | Wind |
| wind_gust_sum_above_2ms | Sum of gusts >2 m/s | Wind |
| wind_gust_hours | Gust-hours (integral) | Wind |
| wind_minutes_above_2ms | Minutes with wind ≥2 m/s | Wind |
| wind_gust_sd | SD of gust speeds (variability) | Wind |
| wind_mode_gust | Most frequent gust speed | Wind |
| sum_butterflies_direct_sun | Sum of butterflies in direct sun (24hr window) | Sun |
| lag_duration_hours | Window length in hours (fixed at 24) | Window |
6 Data Quality Assessment
Code
cat("Data completeness summary:\n")
Data completeness summary:
Code
cat("- Observations with metrics_complete = 0:", sum(daily_data$metrics_complete == 0), "\n")
- Observations with metrics_complete = 0: 2
Code
cat("- Observations with wind_data_coverage < 0.5:", sum(daily_data$wind_data_coverage < 0.5), "\n")
- Observations with wind_data_coverage < 0.5: 5
Code
cat("- Mean temperature coverage:", round(mean(daily_data$temp_data_coverage), 3), "\n")
- Mean temperature coverage: 1
Code
cat("- Mean wind coverage:", round(mean(daily_data$wind_data_coverage), 3), "\n")
- Mean wind coverage: 0.954
Code
cat("- Mean butterfly coverage:", round(mean(daily_data$butterfly_data_coverage), 3), "\n\n")
- Mean butterfly coverage: 0.919
Code
# Show observations with low wind coverage
low_wind <- daily_data %>%
  filter(wind_data_coverage < 0.5) %>%
  select(
    deployment_id, date_t_1, date_t, wind_data_coverage,
    butterflies_95th_percentile_t_1, butterflies_95th_percentile_t
  )

if (nrow(low_wind) > 0) {
  cat("Observations with <50% wind coverage:\n")
  kable(low_wind,
    caption = "Low wind data coverage (likely wind database gaps, not zero-butterfly exclusions)",
    digits = 3
  )
} else {
  cat("All observations have adequate wind coverage\n")
}
Observations with <50% wind coverage:
Low wind data coverage (likely wind database gaps, not zero-butterfly exclusions)

| deployment_id | date_t_1 | date_t | wind_data_coverage | butterflies_95th_percentile_t_1 | butterflies_95th_percentile_t |
|---|---|---|---|---|---|
| SC10 | 2024-01-27 | 2024-01-28 | 0.269 | 0.0 | 6.70 |
| SC10 | 2024-01-28 | 2024-01-29 | 0.042 | 6.7 | 14.70 |
| SC10 | 2024-01-29 | 2024-01-30 | 0.000 | 14.7 | 9.00 |
| SC10 | 2024-01-30 | 2024-01-31 | 0.000 | 9.0 | 7.95 |
| SC9 | 2024-01-28 | 2024-01-29 | 0.042 | 18.3 | 1.00 |
Note: Similar to the sunset window analysis, observations with limited wind data coverage likely reflect gaps in the wind database rather than data preparation issues.
6.1 Filtering Low Quality Observations
Code
# Filter observations with metrics_complete < 0.95
low_quality <- daily_data %>%
  filter(metrics_complete < 0.95)

cat("Observations to exclude (metrics_complete < 0.95):", nrow(low_quality), "\n")
Observations to exclude (metrics_complete < 0.95): 9
Code
cat("Percentage of dataset:", round(nrow(low_quality) / nrow(daily_data) * 100, 1), "%\n\n")
Percentage of dataset: 8.7 %
Code
if (nrow(low_quality) > 0) {
  cat("Excluded observations have:\n")
  cat("- Mean butterflies_95th_t_1:", round(mean(low_quality$butterflies_95th_percentile_t_1), 1), "\n")
  cat("- Mean butterflies_95th_t:", round(mean(low_quality$butterflies_95th_percentile_t), 1), "\n")
  cat("- Mean |butterfly_diff_95th|:", round(mean(abs(low_quality$butterfly_diff_95th)), 1), "\n")
  cat("- Mean metrics_complete:", round(mean(low_quality$metrics_complete), 3), "\n\n")
}
Excluded observations have:
- Mean butterflies_95th_t_1: 28
- Mean butterflies_95th_t: 25.7
- Mean |butterfly_diff_95th|: 11.6
- Mean metrics_complete: 0.56
Full models (2): All main effects + all 10 two-way interactions
Total: 76 models comprehensively testing all meaningful combinations of the 5 selected weather predictors.
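The "10 two-way interactions" figure follows directly from the five selected weather predictors, since every unordered pair yields one interaction. A quick base-R check:

```r
# With 5 weather predictors, the number of unique two-way
# interactions is choose(5, 2) = 10.
weather_predictors <- c(
  "temp_min", "temp_max", "temp_at_max_count_t_1",
  "wind_max_gust", "sum_butterflies_direct_sun"
)

n_pairs <- choose(length(weather_predictors), 2)  # number of pairs
pairs <- combn(weather_predictors, 2)             # 2 x 10 matrix, one column per pair
```

Each pair is then fitted both with and without its main effects, and with both baseline forms, which is how the pair count propagates into the 20-model blocks below.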
8.3 Functional Form Comparison
For baseline control (max_butterflies_t_1), every model is fitted in two versions:

- Smooth baseline: s(max_butterflies_t_1, k=5)
- Linear baseline: linear term
AIC determines optimal functional form for each predictor combination.
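A minimal sketch of the smooth-vs-linear baseline comparison, using `mgcv::gam` on simulated data (the actual models use `gamm` with an AR1 structure, where AIC extraction is slightly more involved; the simulated response here is an assumption for illustration):

```r
library(mgcv)

set.seed(42)
n <- 100
baseline <- runif(n, 0, 50)
# Simulate a mildly nonlinear baseline effect (square-root shaped)
y <- 2 * sqrt(baseline) + rnorm(n)
d <- data.frame(y = y, baseline = baseline)

# Same predictor, two functional forms
m_smooth <- gam(y ~ s(baseline, k = 5), data = d, method = "REML")
m_linear <- gam(y ~ baseline, data = d, method = "REML")

# Lower AIC wins; here the smooth captures the curvature the linear
# term misses, at the cost of a few extra effective degrees of freedom.
aic_tab <- AIC(m_smooth, m_linear)
```

When the true relationship is close to linear, the penalized smooth shrinks toward a straight line and the two AICs converge, which is why fitting both forms per predictor combination is a cheap, safe check.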
8.4 Model Naming Convention
Models are numbered sequentially from M1 through M76, systematically covering:

- Increasing complexity: null → single → interactions → additive → complex → full
- Both smooth and linear baseline versions at each complexity level
- Model descriptions that track predictor combinations and baseline type
Code
# Load required library for AR1
library(nlme)

# Prepare data
model_data <- daily_data %>%
  filter(metrics_complete >= 0.95) %>%
  arrange(deployment_id, observation_order_t) %>%
  mutate(
    deployment_id = factor(deployment_id),
    # Ensure all predictors are numeric
    across(c(
      max_butterflies_t_1, temp_min, temp_max, temp_at_max_count_t_1,
      wind_max_gust, wind_mode_gust, sum_butterflies_direct_sun
    ), as.numeric)
  ) %>%
  # Remove any rows with NA in key variables
  filter(
    !is.na(butterfly_diff_sqrt),
    !is.na(max_butterflies_t_1)
  )

cat("Model data:", nrow(model_data), "observations\n")
cat("Note: lag_duration_hours is constant (24 hours) and excluded from models\n\n")
Note: lag_duration_hours is constant (24 hours) and excluded from models
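Excluding a constant predictor like this can be automated. A base-R sketch (the function name is illustrative, not from the source script):

```r
# Sketch: drop numeric columns with zero variance, which carry no
# information and would break smooth terms if included in a model.
drop_constant_cols <- function(df) {
  keep <- vapply(
    df,
    function(x) !is.numeric(x) || length(unique(x[!is.na(x)])) > 1,
    logical(1)
  )
  df[, keep, drop = FALSE]
}

# lag_duration_hours is fixed at 24 for every observation
d <- data.frame(
  lag_duration_hours = rep(24, 5),
  temp_max = c(18, 21, 19, 25, 22)
)
d2 <- drop_constant_cols(d)
```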
Code
# Define random effects structure with temporal autocorrelation
random_structure <- list(deployment_id = ~1)

# Define correlation structures to test
correlation_structures <- list(
  "no_corr" = NULL, # No temporal correlation
  "AR1" = corAR1(form = ~ observation_order_t | deployment_id) # AR1 within deployments
)
9 Model Fitting
Code
# Initialize model list and tracking
models <- list()
model_descriptions <- list()

# Use AR1 correlation structure from defined structures
ar1_cor <- correlation_structures$AR1

# Set k value for baseline smooth
k_baseline <- 5

# Define weather predictors
weather_predictors <- c(
  "temp_min", "temp_max", "temp_at_max_count_t_1",
  "wind_max_gust", "sum_butterflies_direct_sun"
)

# Helper function to fit model with error handling
fit_model_safe <- function(formula_str, baseline_type, model_name, description) {
  tryCatch(
    {
      model <- gamm(as.formula(formula_str),
        data = model_data,
        random = random_structure,
        correlation = ar1_cor,
        method = "REML"
      )
      models[[model_name]] <<- model
      model_descriptions[[model_name]] <<- description
      cat(sprintf("✓ %s: %s\n", model_name, description))
      return(TRUE)
    },
    error = function(e) {
      cat(sprintf("✗ %s failed: %s\n", model_name, e$message))
      return(FALSE)
    }
  )
}

# ============================================================================
# PART 1: NULL MODELS (2 models)
# ============================================================================
cat("\n=== FITTING NULL MODELS ===\n")
# Linear baseline
fit_model_safe(
  "butterfly_diff_sqrt ~ max_butterflies_t_1",
  "linear", "M2", "Null (linear baseline)"
)
✓ M2: Null (linear baseline)
[1] TRUE
Code
# ============================================================================
# PART 2: SINGLE PREDICTOR MODELS (10 models)
# ============================================================================
cat("\n=== FITTING SINGLE PREDICTOR MODELS ===\n")
# ============================================================================
# PART 3: TWO-WAY INTERACTION MODELS (20 models)
# Only interactions, no main effects (tests pure interaction)
# ============================================================================
cat("\n=== FITTING TWO-WAY INTERACTION MODELS (interactions only) ===\n")
# ============================================================================
# PART 4: MAIN EFFECTS + INTERACTION MODELS (20 models)
# Both main effects + their interaction
# ============================================================================
cat("\n=== FITTING MAIN EFFECTS + INTERACTION MODELS ===\n")
✓ M70: All additive + all temp×wind interactions (linear)
[1] TRUE
Code
model_num <- model_num + 1

# ============================================================================
# PART 7: FULL MODELS (2 models)
# All main effects + ALL two-way interactions
# ============================================================================
cat("\n=== FITTING FULL MODELS (all terms + all interactions) ===\n")
=== FITTING FULL MODELS (all terms + all interactions) ===
# Save full comparison table
write_csv(model_comparison, here("analysis", "dynamic_window_analysis", "model_comparison_24hr.csv"))
cat("Full model comparison saved to: analysis/dynamic_window_analysis/model_comparison_24hr.csv\n")
Full model comparison saved to: analysis/dynamic_window_analysis/model_comparison_24hr.csv
11 Cross-Validation: Overfitting Check
To validate whether the top models are truly better or just overfitting, we perform leave-one-deployment-out cross-validation (LODOCV) on the top 10 models by AICc.
Code
# Select top 10 models for cross-validation
top_10_models <- model_comparison %>%
  head(10) %>%
  pull(model)

cat("Performing LODOCV on top 10 models...\n\n")
Performing LODOCV on top 10 models...
Code
# Get unique deployments
deployments <- unique(model_data$deployment_id)
n_deployments <- length(deployments)

# Initialize results storage
cv_results <- tibble()

# Cross-validation function
perform_lodocv <- function(model_name) {
  model_formula <- formula(models[[model_name]]$gam)
  cv_predictions <- tibble()

  for (dep in deployments) {
    # Split data
    train_data <- model_data %>% filter(deployment_id != dep)
    test_data <- model_data %>% filter(deployment_id == dep)
    if (nrow(test_data) == 0) next

    # Refit model on training data
    tryCatch(
      {
        cv_model <- gamm(
          model_formula,
          data = train_data,
          random = random_structure,
          correlation = ar1_cor,
          method = "REML"
        )

        # Predict on test data
        preds <- predict(cv_model$gam, newdata = test_data, type = "response")

        # Store results
        cv_predictions <- bind_rows(
          cv_predictions,
          tibble(
            deployment_id = dep,
            observed = test_data$butterfly_diff_sqrt,
            predicted = preds
          )
        )
      },
      error = function(e) {
        # Skip if model fails
        NULL
      }
    )
  }

  if (nrow(cv_predictions) == 0) {
    return(tibble(model = model_name, rmse = NA, mae = NA, r2 = NA, n_pred = 0))
  }

  # Calculate prediction metrics
  rmse <- sqrt(mean((cv_predictions$observed - cv_predictions$predicted)^2, na.rm = TRUE))
  mae <- mean(abs(cv_predictions$observed - cv_predictions$predicted), na.rm = TRUE)

  # Calculate R² (correlation-based for cross-validation)
  r2 <- cor(cv_predictions$observed, cv_predictions$predicted, use = "complete.obs")^2

  return(tibble(
    model = model_name,
    rmse = rmse,
    mae = mae,
    r2 = r2,
    n_pred = nrow(cv_predictions)
  ))
}

# Run CV for top 10 models
for (m in top_10_models) {
  cat(sprintf("CV for %s...", m))
  result <- perform_lodocv(m)
  cv_results <- bind_rows(cv_results, result)
  cat(sprintf(" RMSE=%.3f, R²=%.3f\n", result$rmse, result$r2))
}
CV for M31... RMSE=7.768, R²=0.209
CV for M23... RMSE=8.106, R²=0.190
CV for M29... RMSE=8.304, R²=0.186
CV for M19... RMSE=13.102, R²=0.054
CV for M51... RMSE=8.391, R²=0.159
CV for M45... RMSE=9.127, R²=0.091
CV for M5... RMSE=8.335, R²=0.171
CV for M67... RMSE=10.232, R²=0.117
CV for M43... RMSE=8.803, R²=0.131
CV for M17... RMSE=8.061, R²=0.158
Code
# Merge with model comparison
model_comparison_cv <- model_comparison %>%
  left_join(cv_results, by = "model") %>%
  arrange(AICc)

# Display CV results for top 10
cat("\n=== CROSS-VALIDATION RESULTS (TOP 10 MODELS) ===\n\n")
cat("- If high-df models have worse CV metrics despite better AICc, they are overfitting\n\n")
- If high-df models have worse CV metrics despite better AICc, they are overfitting
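The heuristic above can be made concrete. This sketch flags models whose out-of-sample RMSE is far above the best model's; the ratio thresholds are assumptions for illustration, not taken from the source analysis, and only the RMSE values come from the CV output above.

```r
# Sketch: classify overfitting risk from cross-validated RMSE.
# Threshold ratios (1.10, 1.35) are illustrative assumptions.
flag_overfitting <- function(cv_rmse) {
  ratio <- cv_rmse / min(cv_rmse)
  cut(ratio,
    breaks = c(-Inf, 1.10, 1.35, Inf),
    labels = c("Low", "Moderate", "High")
  )
}

# CV RMSE values reported above for four of the top models
cv_rmse <- c(M31 = 7.768, M23 = 8.106, M19 = 13.102, M5 = 8.335)
risk <- setNames(as.character(flag_overfitting(cv_rmse)), names(cv_rmse))
```

Under these thresholds M19, whose CV RMSE is far worse than its AICc rank suggests, would be flagged "High" and passed over during model selection.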
12 Best Model Summary
Code
# Select best model balancing AICc and CV performance
# Prioritize models with low overfitting risk if CV performance is similar
best_model_name <- model_comparison_cv %>%
  filter(overfitting_risk != "High" | is.na(overfitting_risk)) %>%
  arrange(AICc) %>%
  slice(1) %>%
  pull(model)

best_model <- models[[best_model_name]]

cat("=== BEST MODEL (SELECTED):", best_model_name, "===\n\n")
=== BEST MODEL (SELECTED): M31 ===
Code
cat("Selection criteria: Best AICc among models with Moderate/Low overfitting risk\n\n")
Selection criteria: Best AICc among models with Moderate/Low overfitting risk
Code
# Display model details
best_model_details <- model_comparison_cv %>%
  filter(model == best_model_name)

cat("Model details:\n")
'gamm' based fit - care required with interpretation.
Checks based on working residuals may be misleading.
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
k' edf k-index p-value
s(max_butterflies_t_1) 4.00 2.12 1.15 0.89
ti(wind_max_gust,sum_butterflies_direct_sun) 16.00 3.54 0.97 0.32